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Abstract. Cointegration analysis is used to estimate the long-run equilibrium relations be¬ 
tween several time series. The coefficients of these long-run equilibrium relations are the 
cointegrating vectors. In this paper, we provide a sparse estimator of the cointegrating vec¬ 
tors. The estimation technique is sparse in the sense that some elements of the cointegrating 
vectors will be estimated as zero. For this purpose, we combine a penalized estimation pro¬ 
cedure for vector autoregressive models with sparse reduced rank regression. The sparse 
cointegration procedure achieves a higher estimation accuracy than the traditional Johansen 
cointegration approach in settings where the true cointegrating vectors have a sparse struc¬ 
ture, and/or when the sample size is low compared to the number of time series. We also 
discuss a criterion to determine the cointegration rank and we illustrate its good performance 
in several simulation settings. In a hrst empirical application we investigate whether the ex¬ 
pectations hypothesis of the term structure of interest rates, implying sparse cointegrating 
vectors, holds in practice. In a second empirical application we show that forecast perfor¬ 
mance in high-dimensional systems can be improved by sparsely estimating the cointegration 
relations. 
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1 Introduction 


High-dimensional datasets containing thousands of economic time series are commonly avail¬ 
able and accessible at reasonable cost (Stock and Watson, 2002 Clements and Galvao| 2008 


Fan et ah, 20111. The aim of this paper is to develop a cointegration technique for high¬ 


dimensional time series. In a cointegration analysis, long-run equilibrium relations, often 
implied by economic theory, are estimated. In hnancial economics, for instance, cointegra¬ 
tion analysis is used to investigate whether the expectations hypothesis of the term structure 
of interest rates (EHT) holds in practice. The Vector Error Correcting Model (VECM) (see 


e.g. Lutkepohl, 2007) is used to estimate and test for the cointegration relationships. Various 


approaches to test for cointegration are existing (see among others Engle and Granger, 1987 


Phillips and Ouliaris, [1990 ), among which the system cointegration test of Johansen (1988) 


has become most popular. 

The conventional Johansen system cointegration approach has, however, some limita¬ 
tions. In high-dimensional settings, where the number of time series is large compared to 
the sample size, the estimation imprecision will be large. Johansen’s approach is based on 
the estimation of a Vector AutoRegressive (VAR) model and a canonical correlation analy¬ 
sis. A drawback of the VAR model is that the number of parameters increases quadratically 
with the number of included time series. Consequently, regression parameters are estimated 
inaccurately if only a limited number of observations is available. When the number of time 
series exceeds the sample size, Johansen’s approach can not even be applied. 

In this paper, we introduce a penalized maximum likelihood approach to estimate the 
cointegrating vectors in a sparse way, i.e. some of its components are estimated as exactly 
zero. Sparse estimation techniques show good performance in various helds, such as, for 


instance, economics (e.g. Fan et ah, 2011), econometrics (e.g. Caner and Zhang, 2014), or 


macro-economics (e.g. Korobilis, 2013). A sparse cointegration approach is useful for several 
reasons. First, a sparse approach is justihed if economic theory implies sparsity in the 


cointegrating vectors (as is the case for the EHT, see e.g. Engsted and Tanggaard, 1994). 
Secondly, a sparse approach facilitates model interpretation since only a limited number 
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of variables, those corresponding to the non-zero coefficients, enter the estimated long-run 
equilibrium relations. Thirdly, sparsity improves forecast performance through variance 
reduction. Lastly, the sparse cointegration technique, in contrast to Johansen’s method, can 
be applied when the number of time series exceeds the sample size. We show in a simulation 
study that the sparse cointegration technique significantly outperforms Johansen’s approach 
when the cointegrating vectors have a sparse structure or when the number of time series is 
large compared to the sample size. 

We apply the sparse cointegration technique on a hnancial and macro-economic dataset. 
In the hrst empirical application, we investigate whether the expectations hypothesis of the 
term structure of interest rates (EHT) holds in practice. Previous research on the validity of 
the EHT reports evidence in support of the theory at the short end of the term structure (e.g. 


Hall et ah, 1992 Lasak and Velasco, 2014). The theory is generally rejected at the longer 


end (e.g. Shea, 1992 Zhang, 1993; Carstensen, 2003j ). We test the cointegration implications 
linked to the EHT for five US interest rates. Using the sparse cointegration technique, we 
hnd evidence in favor of the zero-sum restriction (i.e. for each cointegrating vector, the sum 
of the cointegration coefficients should be equal to zero). In a second empirical application, 
we use the VECM to perform a forecast exercise in a high-dimensional setting containing 
a large number of industrial production time series. We show that sparsely estimating the 
cointegrating vectors leads to an improvement in forecast performance. 

Cointegration analysis in high-dimensions has received little attention in previous re¬ 
search. Large Vector Autoregressive Models, containing a high number of time series rela¬ 
tive to the sample size, have been considered extensively. Common approaches are, among 


others. Dynamic Factor Models (e.g. Stock and Watson, 2002), Bayesian VAR Models (e.g. 


Banbura et ah, 2010) or Reduced-Rank VAR Models (e.g. Carriero et ah, 2011 Bernardini 


and Cubadda, 2014). Typically, these authors do not account for cointegration. Instead, 


the time series are either transformed in order to achieve stationarity (e.g. Bernardini and 


Cubadda, 2014) or the (non)-stationarity of the time series is accounted for in the prior 


distribution of the autoregressive parameters (e.g. Banbura et ah, 2010). Few studies, e.g. 
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Strachan (2003) or Koop et al. (2006), do account for cointegration by using a Bayesian 


method for obtaining estimates of the cointegrating vectors. These Bayesian approaches, 
in contrast to the sparse cointegration approach discussed in this paper, do not perform 
variable selection and require prior specihcation. 

The remainder of this article is structured as follows. We describe the sparse cointegration 
method in Section 2. Section 3 provides more details on the algorithm. Section 4 discusses the 
Rank Selection Criterion ( Bunea et al.| 2011) to determine the cointegration rank. Section 
5 presents the results of a simulation study. Section 6 discusses the findings on the empirical 
applications. Finally, Section 7 concludes. 


2 Penalized Maximum Likelihood 


Let yt be a g-dimensional multivariate time series, where yt is I (1). We assume that yt follows 
a VAR(p) model. Any order VAR model can be re-written in vector error correcting 


(VECM) representation (Hamilton, 1991) as follows 


p-i 

Ayt = ^riAyi_i + nyt_i + £i, t = l,...,T (1) 

i=l 

where Fi,..., rp_i are q x q matrices containing short-run effects, 11 is a g x g matrix of 
rank r, 0 < r < g and St is assumed to follow a Nq{0, S). 

If we can express Yl = cx(3' with ck and (3 q x r matrices of full column rank r, with 
0 < r < g, then the linear combinations given by (3'yt are stationary and yt is said to be 
cointegrated with cointegration rank r. The cointegrating vectors are the columns of (3 and 
the adjustment coefficients the elements of a. 

We estimate the model parameters in a penalized maximum likelihood framework. It is 
convenient to rewrite model ([^ in matrix notation: 


Y = xr + zn' + E 


( 2 ) 


where Y = (Ay^+i,..., Ay^)'; X = (AXp+i,..., AX^)' with X* = (Ay^.^,..., Ay^.p+J'; 
Z = (yp,..., y-T-i)'; F = (Fi,..., Fp_i)'; and E = (Sp+i,..., Et)'■ Consider the penalized 
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negative log-likelihood, given by 


cir,n,fi) = -tr(^(Y-xr-zn')fi(Y-xr-zn')'j-iog|f2|+AiPi(/3)+A2P2(r)+A3P3(f^), 

( 3 ) 

with tr(-) denoting the trace, f 2 = and Pi, P 2 and P 3 three penalty functions. 

We use Li or Lasso penalization (Tibshirani, 1996| on the cointegrating vectors 


q r 


i=l j=l 


( 4 ) 


As an extension, we also consider the Adaptive Lasso (|Zou 2006) 


q r 


PiiP) = 

i=l j=l 


( 5 ) 


with weights Wij. The weights Wij are computed as the inverse of the Lasso solution Wij = 
for 7 ^ 0. The Adaptive Lasso enjoys the oracle property (consistent for variable 


selection), whereas the Lasso does not (Zou, 2006). 


For the short-run effects F, we use L 2 or Ridge penalization (Hoerl and Kennard, [1970 ) 

( 6 ) 


q q p-l 

i=l j=l k=l 


with the {i,j)th element of Ffc. The Li penalty shrinks parameter estimates towards 
zero and sets some to exactly zero. Contrary to the Li penalty, the L 2 penalty only shrinks 
parameter estimates towards zero. We use an L 2 penalty for the short-run effects since this 
is computationally less expensive and we only require sparsity in the cointegrating vectors. 
Note that using the ridge penalty, estimation remains feasible if the number of time series 
exceeds the sample size. 

Finally, we use Li penalization for the off-diagonal elements of the inverse of the error 
covariance matrix, f 2 . 


p,{n) = j2\^kk'\- 


( 7 ) 


k^kf 


The aim is to select F, FI, f2 so as to minimize (|^ subject to the constraint 


FI = (y.(3\ 
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with CK and (3 q x r matrices of full column rank r. The matrices ck and (3 are not uniquely 
dehned. Section 3 provides more details on the normalization conditions we impose. For the 
unpenalized case (i.e. Ai = 0 , A 2 = 0 and A 3 = 0 ), the objective function (|^ boils down 


to the one introduced by Johansen (1988). The unpenalized case can be solved either by 


using an iterative algorithm or by using the closed-form expressions documented in Johansen 


(1988). 


3 Algorithm 

To hnd the minimum of the penalized negative log-likelihood in (|^, we iteratively solve for 
r conditional on 11, f2; for 11 conditional on F, f2; and for f2 conditional on F, FI. 


Solving for F conditional on FI, 17. When FI and 17 are hxed, the minimization problem 
in (|^ is equivalent to minimizing 


f |n, 17 = argmin ^tr(^(Y - Zn' - XF)17(Y - Zn' - XF)') + A 2 P 2 (r). (8) 


ridge penalty, as given in equation (|^. The closed-form expression for the estimated short- 
run dynamics F is given by 


The above minimization problem is a penalized multivariate regression (see e.g. Rothman 


et ah, 2010) of (Y — ZFI') on X. We solve this penalized multivariate regression using the 


p — ^^ridge/^ridge _|_ 1 ^ridge/^ridge 


with 

fridge _ ^^ 1/2 ^ - zn'), 

where the latter is a vector of length nq containing the stacked values of the time series given 
in the columns of the matrix (Y — ZH'), and 

Xridge ^ (f2l/2^I^)(I^0Z), 


where 0 stands for the kronecker product. 
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Solving for 11 conditional on F, f2. When F and f2 are fixed, the minimization problem 
in is equivalent to 


(A,/3)|FO = argmin -tr (Y - XF - Z/3(a')ri(Y - XF - Z/3(a')' + AiPi(/3). (9) 

a,/3 TV / 

The above minimization problem boils down to a penalized reduced rank regression (e.g. 


Chen and Huang, 2012). For identifiability purposes, we impose the normalization conditions 


(x'flot = If. We first estimate ck conditional on f3, next we estimate (3 conditional on a. 
For fixed f3, the minimization problem in (|^ reduces to 

CK|F,ri,/3 = argmin ;^tr((Y-XF-Z/3a')ri(Y-XF-Z/3a')') st. a'ria = Ir, (10) 

rv ^ ' 


which is a weighted Procrustes problem (Lissitz et al., 1976). This weighted Procrustes 


problem for ot can be seen as an unweighted Procrustes problem for ot* = The 

solution is 

d = fi-i/Vc/', 

where U and V are obtained from the singular value decomposition of 

/3'Z'(Y - XF)ri^/^ = UDV'. 


Note that Chen and Huang (2012) only consider the case where = I, and use a Procrustes 


problem to solve for ck. A weighted Procrustes problem takes the covariance structure into 
account. 


For fixed ck, the minimization problem in ([^ reduces to 

/3|Ffi,a = argmin ;^trf(Y - XF - Z/3a')f^(Y - XF - Z/3a')') + AiPi(/3). (11) 

Since = Ir, there exists a matrix o:*-*- with orthonormal columns such that (q:*,q:*-*“) 

is an orthogonal matrix. Then, with Y = Y — XF, 

tr(^(Y-Z/3cK')ri(Y-Z/3a')') = \\{Y - 

= ||(Yri^/^ - z/3a*')ir 

= ||(Yri^/^ - Z/3cK*')(a*,CK*^)||2 
= ||(Yri^/^a* - Z/3|p + ||(Yri^/^a*^)||2. 
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Since the second term on the left-hand-side does not involve /3, the minimization problem 
reduces to 

$\rn,a = argmin - Z(3)(Yn^/^a* - Z/3)') + AiPi(/3). (12) 

The above minimization problem is a penalized multivariate regression of on 

Z. We consider both a Lasso penalty, as in equation (|^, and an Adaptive Lasso penalty, as 
in equation ([^. 


Solving for Cl conditional on F, 11. When F and FI are hxed, the minimization problem 
in (|^ is equivalent to minimizing 


n\r, n = argmin -tr (Y - ZH' - XF)ri(Y - Zn' - XF)' 
n T 


log|f2| -|- X^P^^Cl). (13) 


The above minimization problem corresponds to penalized covariance estimation. With the 
penalty term as given in equation ([^, this problem can be solved using the glasso algorithm 
of Friedman et al.| (2008). 


We iterate solving minimization problem ([^, ([^ and (13) until the angle between the esti¬ 
mated cointegration space in two successive iterations is smaller than some tolerance value 
e (e.g. e = 10 “^). 


Selection of tuning parameters. We select the tuning parameters Ai, controlling the penaliza¬ 
tion on the cointegrating vectors, and A 2 , controlling the penalization of the short-run effects. 


according to a time series cross-validation approach (Hyndman, 2014), see Appendix B. Since 
the sparseness structure of each cointegrating vector can be different, we allow the selected 
sparsity parameter Ai to be different for each cointegrating vector. The tuning parameter 
A 3 , controlling the penalization on the off-diagonal elements of 17, is selected according to 


the Bayesian Information Criterion (Friedman et ah, 2008). 


Starting values. A starting value for ck, (3 and Cl is required. We start with f2 = Iq. Starting 
values for a. and (3 are obtained by hrst applying the iterative algorithm with an L 2 penalty 











on the cointegrating vectors, initialized by taking every component of /3 equal to one and r^. 
(for A; = 1 ,... — 1 ) the identity matrices. 

Unpenalized objective function. The unpenalized case (i.e. Ai = 0, A 2 = 0 and A 3 = 0) 
can also be solved using the iterative algorithm. We numerically verihed that this iterative 
procedure and Johansen’s closed-form solution yield almost identical results, justifying the 
use of our iterative procedure to solve objective function ( 1 ). 


4 Determination of Cointegration Rank 


At small hnite samples, the asymptotic distribution of Johansen’s trace statistic, used to 
determine the cointegration rank, might poorly approximate the true distribution, resulting 
in substantial size and power distortions (e.g. Johansen, 2002 Nielsen, 2004 p^selius , 2006 


Breitung and Cubadda, 2011). We use an iterative procedure based on the Rank Selection 


Criterion (RSC) of Bunea et ah (2011) to determine the cointegration rank r. We start with 


an initial value of the cointegration rank rgtart = Q- 

For this initial value, we hrst obtain F, using the algorithm discussed in Section 3. In 


a second step, we update our estimate of the cointegration rank. Following Bunea et ah 


(2011), r is given by the number of eigenvalues of the matrix Y'PY that exceed a certain 


threshold /r: 

f = max{r : Ar(Y'PY) > p}, 

with Y = Y — Xr and P = Z(Z'Z)^Z' the projection matrix onto the column space of 


Z. Following the recommendation of Bunea et ah (2011), the threshold is set equal to 
p, = 2S‘^{q + /), with I = rank(Z) and 

||Y-PY||| 

Tq-lq ' 

where || ■ ||f denotes the Frobenius norm for a matrix. We repeat the above procedure using 
the new value of f, this until the estimate of the cointegration rank does not change in two 
successive iterations. 
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The Rank Selection Criterion provides a consistent estimate of the effective rank of the 


coefficient matrix 11 in the penalized reduced rank regression (Bunea et ah, 2011). The 


consistency results are valid when either the length of the time series or the number of time 
series grows to inhnity. This procedure to determine the rank has almost no computational 
cost and can also be used when the number of time series is larger than the sample size. 

5 Simulation Studies 

We conduct a simulation study to evaluate the performance of the penalized ML estimator. 


The considered data generating process (revised from Cavaliere et al., 2012) is the following 
VECM: 

^yt = OLf3'yt-i + Ti^yt-i + et, (t = i,... ,T), 

where the error terms follow a Nq{0, Iq) distribution. We set yo = Ayo = 0. We compare 
the precision accuracy of the penalized ML algorithm to the maximum likelihood procedure 


of Johansen (1988) 


5.1 Simulation designs. Different simulation designs are considered: (i) low-dimensional 
designs (T = 500, g = 4), and (ii) high-dimensional designs with moderate sample size 
(T = 50,g = 110 For each design, we consider both sparse and non-sparse simulation 
settings. Full details on each simulation design can be found in Appendix A. The number of 
simulations for each setting is M = 500. 

Low-dimensional designs. The true cointegrating vectors are sparse in the hrst two sim¬ 
ulation settings. The cointegration rank equals r = 1, r = 2 respectively. In the third 
simulation setting, the true cointegrating vector is non-sparse and r = 1. 

High-dimensional designs. In the hrst two simulation settings, the true cointegrating 
vectors are sparse. The cointegration rank equals r = 1, r = 4 respectively. In the third 


= 11 time series is the largest number for which the critical values of Johansen’s trace statistic are 


tabulated in Johansen (Chapter 15; 19961 or Osterwald-Lenum (1992). Note that Doornik (1998) provide a 


response surface approximation to the critical values tabulated by Johansen for q up to at least 15. 
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simulation setting, the true cointegrating vector is non-sparse and r = 1. 


5.2 Performance measures. To evaluate the estimation accuracy, we compute for each 
simulation run m, with m = 1,..., M, the angle f3) between the estimated coin¬ 

tegration space and the true cointegration space. The average angle is then given by 

1 ^ 

= (14) 

m=l 


Furthermore, we evaluate the performance of the Rank Selection Criterion to the trace 


statistic of Johansen (1988), the Bartlett-corrected trace statistic (Johansen, 2002) and the 


bootstrap procedure of Cavaliere et ah (2012) in correctly selecting the true cointegration 
rankJu The Bartlett-corrected trace statistic (Johansen, 2002) and bootstrap procedure are 


used to improve the small sample performance of Johansen’s trace statistic. For each method, 
we record the relative frequencies, over all simulation runs, of the selected ranks. 


5.3 Results for the low-dimensional designs. The simulation results on the accuracy of 
the estimated cointegration space are reported in Table [Tj For different values of the adjust¬ 
ment coefficients, we report the average angle (averaged across simulation runs) between the 
estimated and the true cointegration space. We use a two-sided paired t-test to test equality 
of the average angle of the sparse estimation method and of Johansen’s method. 

In the sparse settings, the sparse methods are the best performing. They provide sig¬ 
nificantly more precise estimates than the Johansen procedure. For almost all values of the 
adjustment coefficients, the estimation accuracy of the sparse methods is even twice as good 
as that of Johansen’s method. The Sparse Adaptive Lasso provides a more precise estimate 
of the cointegration space than the Sparse Lasso. In the non-sparse setting, Johansen’s 
method is best performing for low values of the adjustment coefficients. For higher values of 
a, however, all methods show similar performance. The usage of the sparse procedures does 
not lead to an lower estimation precision here. 

Table [^reports the results on the determination of the cointegration rank. For reasons of 
brevity, we only report the results for a = —0.4 and a = —0.8. In the first sparse design, the 
^All tests are conducted at the 5% significance level. 
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Table 1: Low-dimensional designs: T = 500, g = 4. Average angle between the estimated and 
true cointegration space. The results are reported for different values of the adjustment coefficient 
a. Significant differences, at the 5% significance level, between the sparse method and Johansen’s 
method are in bold._ 


Method \ a 

-0.2 

-0.4 

-0.6 

-0.8 

Sparse a and /3,r = 1 

Johansen 

0.060 

0.032 

0.020 

0.015 

Sparse Lasso 

0.034 

0.018 

0.012 

0.009 

Sparse Adaptive Lasso 

0.011 

0.004 

0.003 

0.002 

Sparse a and /3, r = 2 

Johansen 

0.013 

0.006 

0.004 

0.003 

Sparse Lasso 

0.007 

0.003 

0.002 

0.002 

Sparse Adaptive Lasso 

0.001 

0.001 

0.001 

0.001 

Non-sparse a and /3, r = 1 

Johansen 

0.026 

0.013 

0.009 

0.007 

Sparse Lasso 

0.073 

0.013 

0.009 

0.007 

Sparse Adaptive Lasso 

0.077 

0.014 

0.009 

0.007 
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Table 2: Low-dimensional designs: T = 500, g = 4. Frequency of the estimated cointegration rank 
f = 0,..., g using Johansen’s trace statistic, the Bartlett-corrected trace statistic, the bootstrap of 


Cavaliere et al. (2012) and the Rank Selection Criterion (RSC). 


True rank 

Method \ f 

0 

1 

2 

3 

4 

0 

1 

2 

3 

4 

Sparse a 

and d 


a = -0.4 



a — —0.8 


r = 1 

Johansen 

0.0 

95.8 

3.8 

0.4 

0.0 

0.0 

95.8 

4.0 

0.2 

0.0 


Bartlett 

0.0 

95.0 

4.2 

0.8 

0.0 

0.0 

95.4 

4.0 

0.6 

0.0 


Bootstrap 

0.0 

96.2 

3.4 

0.4 

0.0 

0.0 

96.0 

3.8 

0.2 

0.0 


RSC 

0.0 

91.0 

9.0 

0.0 

0.0 

0.0 

91.6 

8.4 

0.0 

0.0 

Sparse a 

and (3 


a = -0.4 



a — —0.8 


r = 2 

Johansen 

0.0 

0.0 

96.4 

3.6 

0.0 

0.0 

0.0 

96.0 

3.8 

0.2 


Bartlett 

0.0 

0.0 

93.0 

6.8 

0.2 

0.0 

0.0 

95.2 

4.8 

0.0 


Bootstrap 

0.0 

0.0 

96.0 

3.6 

0.4 

0.0 

0.0 

95.4 

4.2 

0.4 


RSC 

0.0 

0.0 

99.6 

0.4 

0.0 

0.0 

0.0 

99.8 

0.2 

0.0 

Non-sparse a and (3 


a = -0.4 



a — —0.8 


r = 1 

Johansen 

0.0 

94.6 

5.0 

0.4 

0.0 

0.0 

95.6 

3.6 

0.8 

0.0 


Bartlett 

0.0 

95.6 

4.0 

0.4 

0.0 

0.0 

95.6 

3.8 

0.6 

0.0 


Bootstrap 

0.0 

95.6 

4.2 

0.2 

0.0 

0.0 

96.0 

3.4 

0.6 

0.0 


RSC 

0.0 

90.4 

9.6 

0.0 

0.0 

0.0 

91.4 

8.6 

0.0 

0.0 


Rank Selection Criterion achieves competitive performance with a rank recovery percentage 
around 91%. Note that Johansen’s method is aimed at controlling size, resulting in a rank 
recovery percentage around 95% when working with a 5% signihcance level. In the second 
sparse design, RSC is the best performing method. It correctly selects the cointegration rank 
in almost all simulation runs. In the non-sparse design, Johansen’s procedure performs best. 
The rank recovery percentage of RSC remains close to that of Johansen’s trace statistic. 

5.4 Results for the high-dimensional designs. In these designs, we expect that the 
advantage of using the sparse procedures becomes much larger. The sample size is small 
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Table 3: High-dimensional designs: T = 50, q = 11. Average angle between the estimated and 
true cointegration space. Results are reported for different values of the adjustment coefficient 
a. Signihcant differences, at the 5% signihcance level, between the sparse method and Johansen’s 
method are in bold._ 


Method \ a 

-0.2 

-0.4 

-0.6 

-0.8 

Sparse a and /3, r = 1 

Johansen 

1.203 

1.025 

0.825 

0.672 

Sparse Lasso 

0.791 

0.396 

0.228 

0.099 

Sparse Adaptive Lasso 

0.816 

0.392 

0.209 

0.090 

Sparse a and P,r = 4 

Johansen 

0.184 

0.101 

0.064 

0.047 

Sparse Lasso 

0.154 

0.076 

0.047 

0.034 

Sparse Adaptive Lasso 

0.152 

0.070 

0.042 

0.033 

Non-sparse a and /3, r = 1 

Johansen 

1.203 

1.005 

0.810 

0.656 

Sparse Lasso 

0.730 

0.384 

0.250 

0.161 

Sparse Adaptive Lasso 

0.758 

0.403 

0.266 

0.174 


compared to the number of time series, such that the estimation imprecision when using 
Johansen’s approach will become large. The simulation results on the estimation accuracy 
of the estimated cointegration space are reported in Table In all settings, the sparse 
procedures indeed signihcantly outperform Johansen’s procedure. Also for the non-sparse 
design the sparse estimation procedures perform best. The differences are outspoken. Since 
the Lasso and Adaptive Lasso perform regularization, their good performance is retained 
in non-sparse high-dimensional settings. Furthermore, the Sparse Lasso and the Sparse 
Adaptive Lasso show similar performance. 

Table reports the results on the determination of the cointegration rank. In all simu¬ 
lation designs, the Rank Selection Criterion does much better than its alternatives. In the 
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Table 4: High-dimensional designs: T = 50, q = 11. Frequency of the estimated cointegration rank 




True rank 

Method \ r 

0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

11 

Sparse ot 

r = 1 

and (3 

Johansen 

0.0 

0.0 

0.0 

0.0 

0.0 

0.2 

a = -0.4 

3.8 

6.6 

22.2 

25.4 

25.4 

16.4 


Bartlett 

0.0 

8.0 

13.6 

18.6 

16.6 

10.6 

10.2 

5.8 

6.6 

5.6 

4.0 

0.4 


Bootstrap 

89.6 

9.2 

1.2 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 


RSC 

3.8 

95.2 

1.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

r = 1 

Johansen 

0.0 

0.0 

0.0 

0.0 

0.0 

0.2 

a = —0.8 

3.6 

9.2 

22.6 

24.0 

23.4 

17.0 


Bartlett 

0.0 

2.6 

13.4 

23.6 

14.6 

11.8 

13.8 

7.2 

5.2 

4.8 

3.0 

0.0 


Bootstrap 

83.2 

15.0 

1.6 

0.2 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 


RSC 

0.0 

94.6 

5.4 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

Sparse ot 

r = 4 

and (3 

Johansen 

0.0 

0.0 

0.0 

0.0 

0.4 

2.8 

a = -0.4 

24.8 

22.6 

27.2 

12.6 

5.8 

3.8 


Bartlett 

0.0 

1.4 

7.2 

13.4 

18.8 

11.6 

16.6 

8.6 

9.4 

8.0 

4.6 

0.4 


Bootstrap 

75.2 

21.6 

3.0 

0.2 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 


RSC 

1.8 

15.0 

30.6 

37.2 

15.4 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

r = 4 

Johansen 

0.0 

0.0 

0.0 

0.0 

0.4 

1.8 

a = —0.8 

22.4 

28.6 

28.0 

11.2 

5.2 

2.4 


Bartlett 

0.0 

0.0 

1.8 

9.2 

19.8 

16.4 

18.8 

10.8 

10.4 

9.0 

3.6 

0.2 


Bootstrap 

21.6 

44.4 

25.6 

6.8 

1.4 

0.0 

0.0 

0.2 

0.0 

0.0 

0.0 

0.0 


RSC 

0.0 

0.0 

0.0 

3.2 

96.8 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

Non-sparse o: and /3 

r = 1 Johansen 

0.0 

0.0 

0.0 

0.0 

0.0 

0.2 

a = -0.4 

2.6 

8.4 

21.6 

26.8 

25.0 

15.4 


Bartlett 

0.0 

5.4 

18.0 

17.6 

17.6 

12.6 

9.8 

5.6 

5.2 

6.4 

1.8 

0.0 


Bootstrap 

88.6 

9.6 

1.6 

0.2 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 


RSC 

6.4 

92.6 

1.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

r = 1 

Johansen 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

a = —0.8 

2.4 

8.6 

21.0 

25.6 

24.0 

18.4 


Bartlett 

0.0 

2.0 

16.8 

19.8 

18.4 

12.4 

10.4 

4.8 

6.4 

4.6 

4.4 

0.0 


Bootstrap 

80.4 

17.4 

2.0 

0.2 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 


RSC 

0.0 

96.0 

4.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 
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first design with a = —0.8, for instance, RSC estimates the cointegration rank correctly in 
94.6% of the simnlation rnns, the Bartlett-corrected trace statistic in 2.6%, the bootstrap 
in 15.0% and Johansen’s trace statistic in 0% of the simnlation rnns. Dne to the severe size 
distortions in this small sample size design, the rank recovery percentage of Johansen’s trace 
statistic does not improve when working with a signihcance level of, for instance, 1%. 

When the trne cointegration rank becomes higher (r = 4 in the second design), the 
performance of the Rank Selection Criterion becomes sensitive to the strength of the cointe¬ 
gration signal: its rank recovery percentage increases from 15.4% for a = —0.4 to 96.8% for 
a = —0.8. However, even then, RSC is still the best performing method. 


6 Application 

We consider two empirical applications. In the hrst application on interest rates, economic 
theory implies sparsity in the cointegrating vectors. Therefore, it is appealing to use the 
sparse cointegration technique even though standard results from Johansen’s system cointe¬ 
gration test are also available. Secondly, we perform a forecasting exercise in a large VECM 
of industrial production time series. 


6.1 The term structure of interest rates. We use the sparse cointegration approach to 
investigate whether the expectations hypothesis of the term structure of interest rates (EHT) 
holds in practice. The EHT implies that the long-term interest rate can be expressed as an 
average of current and market-expected future short-term interest rates plus a constant risk 
premium: 


T—1 


+ C, 


(15) 


i=0 


where rj and rj are the r-period and one-period interest rates, C is a constant term pre¬ 
mium and Et is the expectations operator conditional on public information at time t (e.g. 


Lanne, 2000). We consider q interest rates .. ,r^‘* with increasing time to maturity 

1, r 2 ,..., Tq. Then equation (15) holds for all pairs of interest rates {rj, {rj, r^^}, ..., 
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{rl, } and we can write 


r 


T 

t 



1 

r 


5^5^E,Ar,‘+, + G. 


i=l j=l 


(16) 


with Since the interest rates are assumed to be /(1), the first differences 


t+3 ' t+3 


are stationary and, hence, the right-hand-side of equation (16) is stationary. This implies that 


the left-hand-side of equation (16) must be stationary as well. There are two cointegration 
implications linked to the EHT. Firstly, there should be g — 1 cointegrating vectors in a 
system with q interest rates of different maturity; or equivalently, one common trend (with 
the number of common trends = q — r). Secondly, the g — 1 yield spreads between the 
one-period interest rate and each n-period interest rate span the cointegration space: 


1 1 ... 1 

-1 0 ... 0 

0 -1 ... 0 

0 0 ... -1 


(17) 


For each cointegrating vector, the sum of the cointegration coefficients should be equal to 
zero (“zero-sum restriction”). Rejection of one of both implications would be considered as 
evidence against the FHT. 

We collect monthly data on Eve US treasury bills with different time to maturity (1, 3, 5, 
7 and 10 years), ranging from July 1969 until February 2014 (source: Federal Reserve, United 
States). Time plots on the interest rates in levels, in first differences and the spreads are 
presented in Figure [Tj A stationarity test of all individual interest rates using the Augmented 
Dickey-Fuller test confirms that the time series are integrated of order 1. 


Cointegration Rank. We estimate the cointegration rank using Johansen’s trace statistic 
and the Rank Selection Criterion discussed in Section 4. Table reports the results on 
the estimated cointegration rank. For the system with two interest rates (2-IR system), 
both procedures estimate the cointegration rank to be one, the number implied by the 
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Figure 1: Time plot of the interest rates (US treasury bills, constant maturity: 1-year; 3-year; 
5-year; 7-year and 10-year, in % per annum) in levels, in first differences and the spreads. 
Period January 1962 to February 2014. 


18 















Table 5: Estimated cointegration rank using Johansen’s trace statistic and the Rank Selection 
Criterion for the four interest rate systems. The last column reports the cointegration rank 
implied by the expectations hypothesis. 


Interest Rate System 

Estimated Cointegration Rank^ 

Johansen Rank Selection Criterion 

Expectations 

Hypothesis 

2-IR system: lY, 2Y 

f = 1 

f = 1 

r = 1 

3-IR system: lY, 2Y, 5Y 

f = 1 

f = 1 

r = 2 

4-IR system: lY, 2Y, 5Y, 7Y 

r = 2 

f = 2 

r = 3 

5-IR system: lY, 2Y, 5Y, 7Y, lOY 

f = 3 

f = 2 

r = 4 


^ Using the Bartlett-corrected trace statistic or the Bootstrap of Cavaliere et al. (20121, we obtain 
the same results as for Johansen’s trace statistic. 


expectations hypothesis. For the other interest rate systems, the estimated cointegration 
rank is lower than implied by the expectations hypothesis. In the 3-IR system, for instance, 
both procedures underestimate the cointegration rank implied by the theory (i.e. r = 2) 
by one (i.e. f = 1). Empirical evidence for more than one common trend is also found 


by other researchers. Carstensen (2003) and Zhang (1993), for instance, find up to three 


common trends when interest rates of longer maturity our included. Giese (2008) find strong 
evidence for two common trends. 


Zero-sum restriction. We test the null-hypothesis 

= 0(q_l)xl, (18) 


Hn:& = 


9 i 62 




with 6j = XlLi (j ~ • • • 5 ? ~ 1) the sum of the coefficients of the jth cointegrating 

vector. Note that the zero-sum restriction implies the cointegration space to be perpendicular 
to the unit vector. Therefore, every basisvector of the cointegration space needs to be 
perpendicular to the unit vector. 

We set the cointegration rank r = g—1, the number implied by the EHT, and estimate the 
cointegration space using Johansen’s ML procedure or the sparse penalized ML procedure 
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resulting in an estimate 0. To test the zero-sum restriction, we bootstrap the Wald test 
statistic Q = ©'Cov“^(©)©. Details on the bootstrap procedure can be found in Appendix 

C. 

Results are in Table Johansen’s procedure reports mixed evidence. The zero-sum 
restriction is rejected for the 3-, 4-, and 5-IR system (p-values < 0.05), but not for the 2-IR 
system (p-value > 0.05). Estimating the cointegrating vectors with a sparse estimator, the 
zero-sum restriction is not rejected (for all interest rate systems p-values > 0.05 ), conhrming 
the EHT. Finally, note that many coefficients of the cointegrating vectors are estimated as 
exactly zero using the sparse penalized ML. This improves interpretability of the estimation 
results. 


6.2 Forecasting industrial production in a large VECM. We consider a large 
VECM with q = 24 industrial production time series related to manufacturing, ranging from 
January 1972 until January 2014. We use an updatec^ version of the Stock and Watson 


(2002) database (source: Federal Reserve, United States). A short description of each time 
series can be found in Table Appendix D. A stationarity test of all individual industrial 
production time series using the Augmented Dickey-Fuller test indicates that the time series 
are integrated of order one, making it appropriate to test for cointegration. 

We use the Rank Selection Criterion from Section 4 to determine the cointegration rank 
since it performs much better than its alternatives in the high-dimensional simulation settings 
of Section 5. The Rank Selection Criterion indicates that the industrial production time 
series are cointegrated with 1 cointegration equation. The sparse method includes the time 
series wood, prim metal, machinery, electrical , food and non-naics in the cointegration 
equation as their associated coefficients are estimated as non-zero. 

We compare the forecast performance of the Sparse penalized ML estimator (with Lasso 
penalty) to Johansen’s ML. We estimate a VECM(l) model with one cointegration relation 
for the 24 industrial production time series. We take the order of the VECM to be one, as 
indicated by both the Bayesian Information Criterion and the Akaike Information Criterion. 


^We extend the time range (until February 2014) and we add more industrial production time series. 
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Table 6: Testing the zero-sum restriction for each interest rate system using Johansen’s ML 
and Sparse penalized ML with Lasso penalty. P-values are reported below every sum. 


Variables 

$1 

Johansen ML 

Cointegrating vectors 

02 03 

0i 

Sparse Lasso 

Cointegrating vectors 

01 02 03 

0i 

2-IR system 









lY 

1.00 




1.00 




2Y 

-1.01 




-0.95 




sum 

-0.01 




0.05 





p = 0.91 




p = 0.71 




3-IR system 









lY 

1.00 

1.00 



1.00 

1.00 



2Y 

-2.41 

-8.76 



-1.52 

0 



5Y 

1.47 

8.19 



0.56 

-0.87 



sum 

0.06 

0 . 4.3 



0.04 

0.13 




p < 0.01 




p = 0.44 




4-IR system 









lY 

1.00 

1.00 

1.00 


1.00 

1.00 

1.00 


2Y 

-19.78 

-2.00 

-6.72 


-1.24 

-0.97 

-1.05 


5Y 

48.29 

0.43 

4.30 


0 

0 

0.06 


7Y 

-29.78 

0.64 

1.79 


0.27 

0 

0.04 


sum 

-0.27 

0.07 

0.37 


0.03 

0.03 

0.05 



p < 0.01 




p = 0.62 




5-IR system 









lY 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

2Y 

-3.23 

37.22 

0.46 

-4.11 

-1.34 

-1.07 

-1.02 

-0.80 

5Y 

1.31 

-113.98 

0.24 

3.51 

0 

-0.11 

0 

0 

7Y 

3.61 

86.16 

-6.65 

-3.70 

0 

0 

0 

0 

lOY 

-2.66 

-9.67 

5.02 

3.61 

0.35 

0.19 

0 

-0.12 

sum 

0.03 

p < 0.01 

0.73 

0.07 

0.31 

0.01 

p = 0.11 

0.01 

-0.02 

0.08 
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Note that we have included an intercept in the VECM of equation Q since some of the 
industrial production time series exhibit a drift. Using a rolling window of 4 years (hence, 
S = 48), the VECM is re-estimated at each time point t = S,... ,T — 1 and 1-step-ahead 
forecasts yt+i = ..., Ut+h computed. For each of the 24 time series {i = 1... ,q = 

24), we compute the Mean Absolute Forecast Error (MAFE). 

MAFE = ^ I sS, - |, (19) 

t=S 

Table reports the results for the two forecast methods. 

Averaged across the 24 time series, the sparse estimation procedure achieves the best 
forecast performance. Its forecast performance is almost two times better than that of Jo¬ 
hansen’s ML (i.e. MAFE of 2.62 against 4.66). Also for 23 out of 24 industrial production 
time series, the MAFE of the Sparse Lasso is lower than the MAFE of Johansen’s ML. A 
Diebold-Mariano test conhrms that the forecast performance of the Sparse Lasso is signih- 
cantly, at the 5% signihcance level, better than Johansen’s ML, for 15 industrial production 
time series. In sum, we show that, in this high-dimensional application, sparsely estimat¬ 
ing the cointegrating vector substantially improves the forecast performance compared to 
Johansen’s approach. 


7 Conclusion 


In this paper, we discuss a sparse cointegration technique. Our simulation study shows 
that the sparse cointegration technique signihcantly outperforms Johansen’s ML procedure, 
when the true cointegrating vectors are sparse or when the sample size is low compared to 


the number of time series. We use the Rank Selection Criterion of Bunea et ah (2011) to 


determine the cointegration rank. In high-dimensional simulation settings, the Rank Selec¬ 
tion Criterion outperforms Johansen’s trace statistic, the Bartlett-corrected trace statistic 


and the bootstrap procedure of Cavaliere et ah (2012). 


Sparsity might be useful for several reasons. First, when the underlying structure of the 
cointegrating vectors is known to be sparse, a sparse cointegration technique allows to explic- 
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Table 7: Mean absolute forecast error (MATE) for the g = 24 industrial production time 
series and the fwo forecast methods: Sparse penalized ML with Lasso penalty and Jo¬ 
hansen’s ML of the g—variate VECM with one cointegration relation. P-values of the 
Diebold-Mariano test are reported in the last column. 


Time Series 

Sparse Lasso 

Johansen ML 

P—value Diebold-Mariano test 

TOT 

1.58 

1.74 

0.54 

NAICS 

1.58 

1.74 

0.52 

DURABLE 

1.60 

1.86 

0.34 

WOOD 

3.27 

6.56 

< 0.01 

NONMETAL 

2.83 

4.78 

< 0.01 

PRIM METAL 

4.21 

9.99 

< 0.01 

EABR METAL 

1.94 

2.23 

0.41 

MACHINERY 

3.25 

4.61 

0.04 

COMPUTER 

1.14 

0.99 

0.47 

ELECRICAL 

2.95 

5.03 

< 0.01 

MOTOR 

4.40 

11.51 

< 0.01 

AEROSPACE 

3.31 

5.22 

0.01 

EURNITURE 

2.91 

3.79 

0.13 

OTHER DURABLE 

1.59 

2.21 

0.02 

NONDURABLE 

1.61 

2.22 

0.06 

FOOD 

1.62 

3.61 

< 0.01 

TEXTILE 

3.50 

7.41 

< 0.01 

APPAREL 

4.88 

11.69 

< 0.01 

PAPER 

2.38 

6.02 

< 0.01 

PRINT 

3.00 

3.35 

0.56 

PETROLEUM 

2.30 

5.59 

< 0.01 

CHEMICAL 

1.89 

2.71 

0.04 

PLASTIC 

2.45 

3.26 

0.04 

NON-NAICS 

2.78 

3.70 

0.09 

Total 

2.62 

4.66 

< 0.01 
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itly capture this sparseness. We illustrate this with the expectations hypothesis. Secondly, in 
high-dimensional settings with cointegrated time series, estimating the cointegrating vectors 
with a sparse estimator might improve estimation accuracy and/or forecast performance as 
illustrated in the industrial production application. Third, in over-parametrized settings, 
where the number of time series is larger than the sample size, traditional cointegration tests 
can not even be computed. The sparse estimator can be used in these settings. 

There are several questions we did not address and which are left for future research. 
For instance, the models analyzed in this paper generally exclude deterministic terms (see 


e.g. Nielsen and Rahbek, 2000). An exception is the industrial production application where 
an intercept was included in the VECM. We also made abstraction of structural breaks. 
Allowing for structural breaks in the analysis is useful when analyzing macro-economic data 


(Johansen et ah, 2000). 

A natural extension of this study would be to implement structural analysis. Impulse- 
response functions, for instance, can be estimated using the sparse estimator. Confidence 
bound around the impulse-response functions are then obtained using a bootstrap proce¬ 
dure. Finally, similar ideas as introduced in this paper can be used to test for Granger 
Causality. Few studies consider Granger Causality in high-dimensional systems, an example 
is 


Jarocinski and Mackowiak (2013). An interesting path for future research is to use a sparse 


procedure to test for Granger Causality in high-dimensions. 
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Appendix A: Simulation designs 


Table 8: Low-dimensional (T = 500,(7 = 4) and high-dimensional (T = 50, g = 11) simulation 


designs. 




Appendix B: Time-series cross-validation 


We select the tuning parameters according to a time series cross-validation approach (Hyn- 


dman, 2014). Denote the response by z^. For the penalized multivariate regression in equa¬ 


tion ([^, zj = Ayt — nyt_i. For the penalized reduced rank regression in equation ([^, 

zt = Ayt - r^Ay^.i. 


1. For t = S',... ,T — 1 (with S such that 80% of the data is included in the first calibration 
sample), repeat: 

(a) For a grid of tuning parameters, fit the model to the data zi,..., z^. 

(b) Compute the one-step-ahead forecast error e^+i = zt+i — zt+i 


2. Select the value of the tuning parameter that minimizes the mean squared forecast error 


MSFE = 


1 1 


T-l q 


T-Sq^^ 

^ t=S i=l 


= « ' 
't+1 

(i) 


a 


with the component of the multivariate time series at time t and the standard 
deviation of the time series 


Appendix C: Bootstrap procedure for testing the zero- 
sum restriction 


To test the zero-sum restriction, we use the following bootstrap procedure (see Cavaliere 


et al., 2012). 


1. Take the cointegrating vector under the null hypothesis, see equation ( |17| ). Given (3^°, 

use the Sparse penalized ML algorithm (or Johansen’s approach) to estimate ..., 

together with the corresponding centered residuals it- 

2. Construct the bootstrap sample recursively from 

p-i 

Ayfo* = ^ ^ ff° Ay*_, + 

^=1 
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with starting values = yj,j = 1 — p,..., 0 and with bootstrap errors obtained using 
a residual bootstrap such that = iut with Ut.,t = 1,... ,T an i.i.d. sequence of discrete 
uniform distributions on {1,..., T}. 


3. Apply the Sparse penalized ML algorithm (or Johansen’s approach) to the bootstrap sample 


yf°*- 


4. 

5. 

6 . 


Construct the bootstrap estimates 0*' 


e\ 9*2 



with 9* = 



Compute the bootstrap statistic Q* = 0*'Cov ^(0*)0*. 

Check if Ylih=i > Q) " with Ql,b = 1,..., B B independent bootstrap statistics - 
exceeds a fixed significance level r]. If so, the null hypothesis Hq is not rejected. 


Appendix D: Industrial Production Time Series 


Table 9: Industrial production time series. Source: Federal Reserve, United States 


Variable 

Description 

TOT 

Total manufacturing 

NAICS 

NAICS industry manufacturing 

DURABLE 

Durable manufacturing 

WOOD 

Wood production 

NONMETAL 

Nonmetallic mineral production 

PRIM METAL 

Primary metal 

EABR METAL 

Fabricated metal 

MACHINERY 

Machinery 

COMPUTER 

Computer and Electronic product 

ELECRICAL 

Electrical equipment, appliance and component 

MOTOR 

Motor vehicles and parts 

AEROSPACE 

Aerospace and other miscellaneous transportation 

EURNITURE 

Furniture and related products 

OTHER DURABLE 

Miscellaneous durable manufacturing 

NONDURABLE 

Nondurable manufacturing 

EOOD 

Food, beverage and tobacco 

TEXTILE 

Textile and production mills 

APPAREL 

Nondurables, apparel and leather goods 

PAPER 

Paper 

PRINT 

Printing and related support activities 

PETROLEUM 

Petroleum and coal products 

CHEMICAL 

Chemical 

PLASTIC 

Plastics and rubber products 

NON-NAICS 

non-NAICS industry manufacturing 
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